Business Forecasting › Class Slides

Decomposition & Moving Averages

Lecture 3

What is a moving average, and how does it smooth a time series?

The moving average smooths out short-run noise.

A k-period moving average averages the nearest k observations.
  • A 3-MA at time t: ⅓(yt−1 + yt + yt+1)
  • Each point in the smoothed series is a local average of the raw data.
Larger k → smoother trend, but more data lost at the ends.
  • A 3-MA loses 1 point at each end; a 7-MA loses 3.
  • Too small: still noisy. Too large: over-smoothed, misses genuine turns.
Odd-order MAs are symmetric; even-order MAs are not.
  • A 4-MA is centered between two time points, so it needs a further 2-MA to align with the original series.
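The smoothing and end-point loss above can be sketched in plain Python (outside the fpp3/R workflow; the function name is illustrative):

```python
def moving_average(y, k):
    """Centered k-period moving average (k odd).

    Returns a list aligned with y; the (k-1)//2 points at each
    end have no estimate and are left as None.
    """
    half = (k - 1) // 2
    out = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half : t + half + 1]
        out[t] = sum(window) / k
    return out

y = [10, 12, 9, 14, 11, 13, 10]
ma3 = moving_average(y, 3)
# A 3-MA loses one point at each end:
# ma3[0] and ma3[-1] are None; ma3[1] == (10 + 12 + 9) / 3
```

A 7-MA on the same series leaves only the middle point, illustrating the trade-off between smoothness and data loss.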
Even-period seasonality requires a 2×m-MA to estimate trend.
For quarterly data (m = 4), a 2×4-MA first takes a 4-MA, then averages consecutive pairs. For monthly data (m = 12), a 2×12-MA does the same.
The result is a symmetric, centered moving average that assigns equal weight to all seasons — so the seasonal pattern cancels out and only the trend remains.
T̂t = ⅛yt−2 + ¼yt−1 + ¼yt + ¼yt+1 + ⅛yt+2   (2×4-MA)
This is the trend estimate used in classical decomposition.
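The equivalence between "a 4-MA then a 2-MA" and the ⅛–¼–¼–¼–⅛ weight form can be checked directly (a sketch with an illustrative helper name):

```python
def two_by_four_ma(y, t):
    """2x4-MA at index t: a 4-MA followed by a 2-MA.

    Equivalent to weights (1/8, 1/4, 1/4, 1/4, 1/8) on
    y[t-2..t+2], so each quarter receives total weight 1/4.
    """
    ma4_left = sum(y[t - 2 : t + 2]) / 4   # 4-MA centered at t - 1/2
    ma4_right = sum(y[t - 1 : t + 3]) / 4  # 4-MA centered at t + 1/2
    return (ma4_left + ma4_right) / 2

y = [8, 12, 15, 9, 10, 14, 17, 11]
trend = two_by_four_ma(y, 3)
weights = [1 / 8, 1 / 4, 1 / 4, 1 / 4, 1 / 8]
direct = sum(w * v for w, v in zip(weights, y[1:6]))
# trend == direct == 11.75
```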

How does classical decomposition estimate each component?

Classical decomposition: step by step

Step 1. Estimate the trend Tt using a 2×m-MA.
  • This removes the seasonal and most irregular variation.
Step 2. De-trend the series.
  • Additive: yt − T̂t  |  Multiplicative: yt / T̂t
Step 3. Estimate the seasonal component St.
  • Average the de-trended values for each season (e.g., all Januaries, all Februaries…).
  • Adjust so the seasonal indices sum to zero (additive) or to m (multiplicative).
Step 4. Compute the remainder Rt.
  • Additive: Rt = yt − T̂t − Ŝt  |  Multiplicative: Rt = yt / (T̂t Ŝt)
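The four steps can be sketched end-to-end for quarterly data in plain Python (additive form; the function name is illustrative, not from fpp3):

```python
def classical_additive(y, m=4):
    """Classical additive decomposition for seasonal period m (m even).

    Returns (trend, seasonal, remainder) lists aligned with y;
    entries without a trend estimate are None.
    """
    n = len(y)
    half = m // 2
    # Step 1: trend via a 2 x m-MA (the ends get no estimate).
    trend = [None] * n
    for t in range(half, n - half):
        left = sum(y[t - half : t + half]) / m
        right = sum(y[t - half + 1 : t + half + 1]) / m
        trend[t] = (left + right) / 2
    # Step 2: de-trend.
    detrended = [y[t] - trend[t] if trend[t] is not None else None
                 for t in range(n)]
    # Step 3: average de-trended values per season, then
    # adjust the indices to sum to zero.
    raw = []
    for s in range(m):
        vals = [d for t, d in enumerate(detrended)
                if t % m == s and d is not None]
        raw.append(sum(vals) / len(vals))
    mean_idx = sum(raw) / m
    idx = [r - mean_idx for r in raw]
    seasonal = [idx[t % m] for t in range(n)]
    # Step 4: remainder.
    remainder = [y[t] - trend[t] - seasonal[t] if trend[t] is not None
                 else None for t in range(n)]
    return trend, seasonal, remainder

# Three years of quarterly data: a steady trend plus a
# repeating seasonal pattern, so the remainder is ~zero.
y = [12, 18, 25, 15, 14, 20, 27, 17, 16, 22, 29, 19]
trend, seasonal, remainder = classical_additive(y, m=4)
```

Note that `trend[0]`, `trend[1]` and the last two entries are `None`: the first of the limitations listed next.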
Classical decomposition has several important limitations.
  • Trend not estimated at the ends. The moving average requires observations on both sides, so the first and last m/2 points have no trend estimate.
  • Seasonal component assumed constant. It forces the same seasonal index for every year; a shifting seasonal pattern is missed entirely.
  • Sensitive to outliers. An extreme value pulls the moving average and distorts the trend estimate.
  • No formal uncertainty. There are no standard errors or prediction intervals for the components.
These limitations motivate more modern methods: X-11, SEATS, and STL.

When and how should you transform a time series before modelling?

The Box-Cox family generalizes common variance-stabilizing transformations.
The Box-Cox transformation with parameter λ:
wt = (yt^λ − 1) / λ   (λ ≠ 0)  ;  wt = log(yt)   (λ = 0)
λ | Transformation | When useful
λ = 1 | None (identity) | Variance already constant
λ = ½ | Square root | Counts, mild heteroskedasticity
λ = 0 | Natural log | Variance grows with level (most common)
λ = −1 | Reciprocal | Strongly growing variance
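The cases in the table follow directly from the definition (a plain-Python sketch; the function name is illustrative):

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a single value y > 0."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam

# lam = 1 is the identity shifted by 1:  box_cox(10, 1) == 9.0
# lam = 0 is the natural log:            box_cox(10, 0) == math.log(10)
# and the power form approaches the log as lam -> 0.
```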
The Guerrero method chooses λ automatically.
Rather than guessing λ, let the data choose it. The Guerrero method selects the value of λ that minimizes the coefficient of variation of sub-series means — i.e., it makes the variation as constant as possible across the series.
In R (fpp3):
lambda <- data |> features(variable, features = guerrero) |> pull(lambda_guerrero)
data |> autoplot(box_cox(variable, lambda))
Practical rule: if the chosen λ is close to 0, use a log; if it is close to 1, skip the transformation. Prefer a simple, interpretable transformation over a precisely optimized λ.
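A rough sketch of Guerrero's idea (not fpp3's actual implementation): split the series into sub-series of one seasonal period each, and grid-search the λ that makes sᵢ / mᵢ^(1−λ) most nearly constant across sub-series. Function and variable names here are illustrative:

```python
import statistics

def guerrero_lambda(y, m, grid=None):
    """Grid-search sketch of Guerrero's criterion for positive data.

    For each candidate lambda, compute s_i / m_i**(1 - lam) for
    each length-m sub-series (m_i = mean, s_i = sd) and return the
    lambda minimizing the coefficient of variation of that ratio.
    """
    if grid is None:
        grid = [k / 10 for k in range(-10, 21)]  # lambda in -1.0 .. 2.0
    chunks = [y[i : i + m] for i in range(0, len(y) - m + 1, m)]
    best, best_cv = None, float("inf")
    for lam in grid:
        ratios = [statistics.stdev(c) / statistics.mean(c) ** (1 - lam)
                  for c in chunks]
        cv = statistics.stdev(ratios) / statistics.mean(ratios)
        if cv < best_cv:
            best, best_cv = lam, cv
    return best

# A series whose spread doubles whenever its level doubles:
y = [10, 12, 11, 9, 20, 24, 22, 18, 40, 48, 44, 36]
lam = guerrero_lambda(y, m=4)
# Here the log (lambda near 0) stabilizes the variance.
```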
Forecasts must always be back-transformed to the original scale.
Modelling on log(yt) produces forecasts on the log scale. The back-transformation is exp(ŷT+h), but this gives the median of the original-scale distribution, not the mean.
For the mean (which is usually preferred for minimizing squared error loss), a bias correction is needed:
E[yT+h] ≈ exp(ŷT+h + ½σ̂h²)
The fable package (loaded with fpp3) applies this bias adjustment automatically when it back-transforms forecasts, so the reported point forecasts are means rather than medians.
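The size of the bias is easy to check by simulation: if the log-scale forecast is normal with mean μ and sd σ, then exp(μ) is the median of the original-scale distribution while exp(μ + σ²/2) is its mean (a stdlib sketch with made-up μ and σ):

```python
import math
import random

random.seed(42)
mu, sigma = 2.0, 0.5

# Simulate the original-scale distribution implied by a
# normal forecast on the log scale:
draws = [math.exp(random.gauss(mu, sigma)) for _ in range(200_000)]
simulated_mean = sum(draws) / len(draws)

naive_back = math.exp(mu)                        # the median (~7.39)
bias_adjusted = math.exp(mu + sigma ** 2 / 2)    # the mean (~8.37)
# simulated_mean lands near bias_adjusted, noticeably above naive_back
```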

More advanced decomposition methods

X-11 — developed by the U.S. Census Bureau.
  • Iteratively applies weighted moving averages, allowing the seasonal pattern to evolve slowly over time.
  • Handles trading-day effects and moving holidays (e.g., Easter).
  • Used to produce official seasonally adjusted economic statistics.
SEATS — Signal Extraction in ARIMA Time Series.
  • Model-based approach: fits an ARIMA model, then extracts components from it.
  • Used by Eurostat and many central banks for official seasonal adjustment.
X-13-ARIMA-SEATS combines both methods.
  • Available in R via the seasonal package; X_13ARIMA_SEATS() in fpp3.
STL is the most flexible and widely applicable method.
Recall: STL = Seasonal and Trend decomposition using Loess. In fpp3:
data |> model(STL(variable ~ trend(window=7) + season(window='periodic'))) |>
  components() |> autoplot()
Key controls:
  • trend(window) — width of the loess window for the trend. Larger = smoother trend.
  • season(window='periodic') — forces a constant seasonal pattern. Use a numeric window to allow it to evolve.
  • Robust option downweights outliers so they do not distort the trend or seasonal estimates.
Seasonally adjusted data removes the seasonal component.
Seasonally adjusted = original minus seasonal: yt^SA = yt − Ŝt (or yt / Ŝt for multiplicative).
Seasonally adjusted series reveal the underlying trend and cycle without the calendar noise. This is why GDP, employment, and retail sales figures reported in the news are almost always seasonally adjusted.
Caution: seasonal adjustment is not always appropriate. For some decisions (e.g., staffing a ski resort), the seasonal pattern is the signal, not the noise.
Some variation in data is due to calendar effects, not genuine patterns.
Trading-day variation: months with more business days have higher production or sales. February always has fewer days than March. Adjusting for this prevents spurious seasonality.
Population adjustment: per-capita series are more comparable over time than totals when the population is growing. Always consider whether the raw total or a ratio is the right variable to model.
Inflation adjustment: nominal values can grow simply because of rising prices. Deflating by a price index converts nominal to real values before modelling.
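All three adjustments are simple arithmetic (illustrative numbers, not real data):

```python
# Trading-day adjustment: express monthly sales per business day.
feb_sales, feb_days = 200_000, 20
mar_sales, mar_days = 230_000, 23
feb_per_day = feb_sales / feb_days   # 10_000.0
mar_per_day = mar_sales / mar_days   # 10_000.0
# The apparent Feb -> Mar jump disappears per business day.

# Population adjustment: totals -> per-capita.
gdp, population = 2.1e12, 67e6
gdp_per_capita = gdp / population

# Inflation adjustment: deflate nominal values by a price index
# (base-year index = 100).
nominal_2020, cpi_2020 = 500.0, 125.0
real_2020 = nominal_2020 / (cpi_2020 / 100)   # 400.0 in base-year prices
```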

Using decomposition to improve forecasts

Decompose, forecast components, recompose.
  • Forecast the seasonally adjusted series (trend + remainder) separately from the seasonal component.
  • Add the seasonal component back to get the final forecast.
In fpp3: decomposition_model().
  • Wraps a decomposition method and a sub-model for the seasonally adjusted series.
  • Example: STL + ETS(A,A,N) — STL removes seasonality, ETS forecasts the rest.
This is a practical benchmark for seasonal series.
  • Often outperforms naive or purely automatic approaches, especially at short horizons.
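The decompose-forecast-recompose idea can be sketched without any modelling library: estimate additive seasonal indices, forecast the seasonally adjusted series with a drift line, then add the indices back. All names are illustrative and the seasonal estimate is a crude stand-in for STL; fpp3's decomposition_model() is the real tool:

```python
import statistics

def seasonal_indices(y, m):
    """Additive seasonal indices from season-by-season means,
    adjusted to sum to zero (a crude stand-in for STL)."""
    means = [statistics.mean(y[s::m]) for s in range(m)]
    overall = statistics.mean(means)
    return [v - overall for v in means]

def forecast_with_decomposition(y, m, h):
    """Forecast h steps ahead: a drift line on the seasonally
    adjusted series, plus the re-added seasonal index."""
    idx = seasonal_indices(y, m)
    sa = [y[t] - idx[t % m] for t in range(len(y))]
    drift = (sa[-1] - sa[0]) / (len(sa) - 1)   # drift-method slope
    n = len(y)
    return [sa[-1] + drift * i + idx[(n + i - 1) % m]
            for i in range(1, h + 1)]

y = [12, 18, 25, 15, 14, 20, 27, 17, 16, 22, 29, 19]
fc = forecast_with_decomposition(y, m=4, h=4)
# fc rises with the trend while repeating the seasonal shape.
```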

Decomposition workflow in fpp3

1. Check for non-constant variance → apply Box-Cox if needed.
  • Use guerrero feature or visual inspection to choose λ.
2. Decompose with STL (or X-13 for monthly/quarterly official data).
  • Tune trend(window) and season(window) if defaults look wrong.
3. Inspect the remainder component.
  • Should look like white noise. Large systematic remainders signal a poor decomposition.
4. Use seasonally adjusted series for further modelling or reporting.
  • components() |> select(season_adjust) in fpp3.